visualization tool
Exploring Urban Factors with Autoencoders: Relationship Between Static and Dynamic Features
Pocco, Ximena, Hassan, Waqar, Salinas, Karelia, Molchanov, Vladimir, Nonato, Luis G.
Urban analytics utilizes extensive datasets with diverse urban information to simulate, predict trends, and uncover complex patterns within cities. While these data enable advanced analysis, they also present challenges due to their granularity, heterogeneity, and multimodality. To address these challenges, visual analytics tools have been developed to support the exploration of latent representations of fused heterogeneous and multimodal data, discretized at a street level of detail. However, visualization-assisted tools seldom explore the extent to which fused data can offer deeper insights than examining each data source independently within an integrated visualization framework. In this work, we developed a visualization-assisted framework to analyze whether fused latent data representations are more effective than separate representations in uncovering patterns from dynamic and static urban data. The analysis reveals that combined latent representations produce more structured patterns, while separate ones are useful in particular cases.
Summarizing Normative Driving Behavior From Large-Scale NDS Datasets for Vehicle System Development
This paper presents a methodology to process large-scale naturalistic driving study (NDS) data to describe driving behavior for five vehicle metrics (speed, speeding, lane keeping, following distance, and headway), contextualized by roadway characteristics, vehicle classes, and driver demographics. Such descriptions of normative driving behaviors can aid in the development of vehicle safety and intelligent transportation systems. The methodology is demonstrated using data from the Second Strategic Highway Research Program (SHRP 2) NDS, which includes over 34 million miles of driving across more than 3,400 drivers. Summaries of each driving metric were generated using vehicle, GPS, and forward radar data. Additionally, interactive online analytics tools were developed to visualize and compare driving behavior across groups through dynamic data selection and grouping. For example, among drivers on 65-mph roads in the SHRP 2 NDS, females aged 16-19 exceeded the speed limit by 7.5 to 15 mph slightly more often than their male counterparts, and younger drivers maintained headways under 1.5 seconds more frequently than older drivers. This work supports better vehicle systems and safer infrastructure by quantifying normative driving behaviors and offers a methodology for analyzing NDS datasets for cross-group comparisons.
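As a rough illustration of the kind of group-wise summary described above, the sketch below computes time headway (following distance divided by own speed) and the share of samples under 1.5 seconds per group. The record layout, group labels, function names, and sample values are hypothetical; they are not taken from the SHRP 2 data or from the paper's tooling.

```python
from collections import defaultdict

def headway_seconds(gap_m, speed_mps):
    """Time headway: gap to the lead vehicle divided by own speed."""
    if speed_mps <= 0:
        return None  # headway is undefined when the vehicle is stationary
    return gap_m / speed_mps

def share_under_threshold(samples, threshold_s=1.5):
    """samples: iterable of (group, gap_m, speed_mps) tuples.

    Returns {group: fraction of valid samples with headway < threshold_s}.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [under, total]
    for group, gap_m, speed_mps in samples:
        h = headway_seconds(gap_m, speed_mps)
        if h is None:
            continue  # skip samples without a defined headway
        counts[group][1] += 1
        if h < threshold_s:
            counts[group][0] += 1
    return {g: under / total for g, (under, total) in counts.items()}

# Made-up samples for illustration only.
samples = [
    ("16-19", 30.0, 25.0),  # 1.2 s headway
    ("16-19", 50.0, 25.0),  # 2.0 s headway
    ("65+", 60.0, 25.0),    # 2.4 s headway
    ("65+", 20.0, 0.0),     # stationary, skipped
]
shares = share_under_threshold(samples)
```

With these made-up samples, half of the "16-19" records and none of the "65+" records fall under the 1.5-second threshold.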
Embedding Atlas: Low-Friction, Interactive Embedding Visualization
Ren, Donghao, Hohman, Fred, Lin, Halden, Moritz, Dominik
Embedding projections are popular for visualizing large datasets and models. However, people often encounter "friction" when using embedding visualization tools: (1) barriers to adoption, e.g., tedious data wrangling and loading, scalability limits, and no integration of results into existing workflows, and (2) limitations in possible analyses, e.g., no integration with external tools to additionally show coordinated views of metadata. In this paper, we present Embedding Atlas, a scalable, interactive visualization tool designed to make interacting with large embeddings as easy as possible. Embedding Atlas uses modern web technologies and advanced algorithms, including density-based clustering and automated labeling, to provide a fast and rich data analysis experience at scale. We evaluate Embedding Atlas with a competitive analysis against other popular embedding tools, showing that Embedding Atlas's feature set specifically helps reduce friction, and report a benchmark on its real-time rendering performance with millions of points. Embedding Atlas is available as open source to support future work in embedding-based analysis.
ParaView-MCP: An Autonomous Visualization Agent with Direct Tool Use
Liu, Shusen, Miao, Haichao, Bremer, Peer-Timo
While powerful and well-established, tools like ParaView present a steep learning curve that discourages many potential users. This work introduces ParaView-MCP, an autonomous agent that integrates modern multimodal large language models (MLLMs) with ParaView to not only lower the barrier to entry but also augment ParaView with intelligent decision support. By leveraging the state-of-the-art reasoning, command execution, and vision capabilities of MLLMs, ParaView-MCP enables users to interact with ParaView through natural language and visual inputs. Specifically, our system adopts the Model Context Protocol (MCP), a standardized interface for model-application communication, which facilitates direct interaction between MLLMs and ParaView's Python API to allow seamless information exchange between the user, the language model, and the visualization tool itself. Furthermore, by implementing a visual feedback mechanism that allows the agent to observe the viewport, we unlock a range of new capabilities, including recreating visualizations from examples, closed-loop visualization parameter updates based on user-defined goals, and even cross-application collaboration involving multiple tools. Broadly, we believe such an agent-driven visualization paradigm can profoundly change the way we interact with visualization tools. We expect a significant uptake in the development of such visualization tools, in both visualization research and industry.
DetoxAI: a Python Toolkit for Debiasing Deep Learning Models in Computer Vision
Stępka, Ignacy, Sztukiewicz, Lukasz, Wiliński, Michał, Stefanowski, Jerzy
While machine learning fairness has made significant progress in recent years, most existing solutions focus on tabular data and are poorly suited for vision-based classification tasks, which rely heavily on deep learning. To bridge this gap, we introduce DetoxAI, an open-source Python library for improving fairness in deep learning vision classifiers through post-hoc debiasing. DetoxAI implements state-of-the-art debiasing algorithms, fairness metrics, and visualization tools. It supports debiasing via interventions in internal representations and includes attribution-based visualization tools and quantitative algorithmic fairness metrics to show how bias is mitigated. This paper presents the motivation, design, and use cases of DetoxAI, demonstrating its tangible value to engineers and researchers.
EXACT-CT: EXplainable Analysis for Crohn's and Tuberculosis using CT
Gupta, Shashwat, Gupta, Sarthak, Agrawal, Akshan, Naaz, Mahim, Yadav, Rajanikanth, Bagade, Priyanka
Crohn's disease and intestinal tuberculosis share many overlapping clinical, radiological, endoscopic, and histological features, particularly granulomas, making it challenging to differentiate them clinically. Our research leverages 3D CTE scans, computer vision, and machine learning to improve this differentiation and avoid harmful treatment mismanagement, such as unnecessary anti-tuberculosis therapy for Crohn's disease or exacerbation of tuberculosis with immunosuppressants. Our study proposes a novel method to detect radiologist-identified biomarkers such as VF to SF ratio, necrosis, calcifications, comb sign, and pulmonary TB to enhance accuracy. We demonstrate its effectiveness by applying different ML techniques to the features extracted from these biomarkers, computing SHAP values on XGBoost to understand feature importance for predictions, and comparing against SOTA methods such as pretrained ResNet and CTFoundation.
Visualizing Machine Learning Models for Enhanced Financial Decision-Making and Risk Management
Ganguly, Priyam, Garine, Ramakrishna, Mukherjee, Isha
This study emphasizes how crucial it is to visualize machine learning models, especially for the banking industry, in order to improve interpretability and support predictions in high-stakes financial settings. Visual tools enable performance improvements and support the creation of innovative financial models by offering crucial insights into algorithmic decision-making processes. Within a financial machine learning framework, the research uses visually guided experiments to make important concepts, such as risk assessment and portfolio allocation, more understandable. The study also examines variations in trading tactics and how they relate to risk appetite, concluding that the frequency of portfolio rebalancing is negatively correlated with risk tolerance. Visualization is largely what makes these findings possible. The study concludes by presenting a novel method of locally stochastic asset weighting, where visualization facilitates data extraction and validation. This highlights the usefulness of these methods in furthering the field of financial machine learning research.
ScriptViz: A Visualization Tool to Aid Scriptwriting based on a Large Movie Database
Rao, Anyi, Chou, Jean-Peïc, Agrawala, Maneesh
Scriptwriters usually rely on mental visualization to create a vivid story, using their imagination to see, feel, and experience the scenes they are writing. Besides mental visualization, they often refer to existing images or scenes in movies and analyze the visual elements to create a certain mood or atmosphere. In this paper, we develop ScriptViz to provide external visualization for the screenwriting process. It retrieves reference visuals on the fly from a large movie database based on the script's text and dialogue. The tool provides two types of control over visual elements that enable writers to 1) see exactly what they want with fixed visual elements and 2) see variances in uncertain elements. A user evaluation with 15 scriptwriters shows that ScriptViz is able to present scriptwriters with consistent yet diverse visual possibilities that align closely with their scripts and help their creative process.
AEye: A Visualization Tool for Image Datasets
Grötschla, Florian, Lanzendörfer, Luca A., Calzavara, Marco, Wattenhofer, Roger
Image datasets serve as the foundation for machine learning models in computer vision, significantly influencing model capabilities, performance, and biases alongside architectural considerations. Therefore, understanding the composition and distribution of these datasets has become increasingly crucial. To address the need for intuitive exploration of these datasets, we propose AEye, an extensible and scalable visualization tool tailored to image datasets. AEye utilizes a contrastively trained model to embed images into semantically meaningful high-dimensional representations, facilitating data clustering and organization. To visualize the high-dimensional representations, we project them onto a two-dimensional plane and arrange images in layers so users can seamlessly navigate and explore them interactively. AEye supports semantic search with both text and image queries. We open-source the codebase for AEye and provide a simple configuration to add datasets.
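Semantic search over embeddings is commonly implemented as nearest-neighbor ranking by cosine similarity. The sketch below shows that general idea with the standard library only; the toy three-dimensional vectors, file names, and the `search` helper are made up for illustration and do not reflect AEye's actual model, index, or API.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)

def search(query_vec, index, top_k=2):
    """Return the names of the top_k entries most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), name)
              for name, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]

# Hypothetical embeddings; a real system would obtain these from a
# contrastively trained model that embeds text and images in one space.
index = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.8, 0.3, 0.1],
    "car.jpg": [0.0, 0.2, 0.9],
}
query = [1.0, 0.0, 0.0]  # stand-in for an embedded text or image query
results = search(query, index)
```

Because text and image queries would both be mapped to vectors in the same space, the same ranking function can serve either query type.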
By My Eyes: Grounding Multimodal Large Language Models with Sensor Data via Visual Prompting
Yoon, Hyungjun, Tolera, Biniyam Aschalew, Gong, Taesik, Lee, Kimin, Lee, Sung-Ju
Large language models (LLMs) have demonstrated exceptional abilities across various domains. However, utilizing LLMs for ubiquitous sensing applications remains challenging as existing text-prompt methods show significant performance degradation when handling long sensor data sequences. We propose a visual prompting approach for sensor data using multimodal LLMs (MLLMs). We design a visual prompt that directs MLLMs to utilize visualized sensor data alongside the target sensory task descriptions. Additionally, we introduce a visualization generator that automates the creation of optimal visualizations tailored to a given sensory task, eliminating the need for prior task-specific knowledge. We evaluated our approach on nine sensory tasks involving four sensing modalities, achieving an average of 10% higher accuracy than text-based prompts and reducing token costs by 15.8x. Our findings highlight the effectiveness and cost-efficiency of visual prompts with MLLMs for various sensory tasks.
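The core idea of visual prompting, presenting a sensor sequence as a plot rather than as a long string of numbers, can be sketched as follows. The `series_to_svg` helper is hypothetical and uses only the standard library to draw a plain SVG polyline; it does not reproduce the paper's visualization generator, its choice of chart types, or any MLLM API.

```python
def series_to_svg(values, width=200, height=100):
    """Render a 1-D sensor sequence as an SVG line plot (returned as a string)."""
    if len(values) < 2:
        raise ValueError("need at least two samples")
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat signal
    step = width / (len(values) - 1)
    points = []
    for i, v in enumerate(values):
        x = i * step
        y = height - (v - lo) / span * height  # SVG y axis points downward
        points.append(f"{x:.1f},{y:.1f}")
    joined = " ".join(points)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}">'
        f'<polyline fill="none" stroke="black" points="{joined}"/>'
        "</svg>"
    )

# Made-up accelerometer-like readings for illustration.
svg = series_to_svg([0.0, 1.0, 0.0])
```

In a full pipeline, the rendered image would be attached to the prompt together with the task description, in place of the raw numeric sequence.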